Automatic Term Recognition in Polish Texts
Abstract
Although automatic term recognition (ATR) has been a research focus for over a decade, most approaches have been developed for highly positional languages. Only a few efforts have addressed Slavic languages, which have richer morphological inflection and a more relaxed word order, e.g., Vintar (2004) for Slovene and Nenadic et al. (2003) for Serbian. In this paper, we report on our experiments in adopting the method used for those two Slavic languages and applying it to term extraction from Polish texts.
Similar resources
Generating of Events Dictionaries from Polish WordNet for the Recognition of Events in Polish Documents
In this article we present the results of recent research on the recognition of events in Polish. Event recognition plays a major role in many natural language processing applications such as question answering or automatic summarization. We adapted the TimeML specification (the well-known guideline for English) to the Polish language. We annotated 540 documents in Polish Corpus of Wrocław Universi...
Resources for Information Extraction from Polish texts
The paper presents a collection of resources developed for Information Extraction (IE) from Polish texts. In particular, we mention two IE platforms adapted to Polish and several IE applications built on top of one of them: named entity recognition, creation of terminology lexicons, and data extraction from medical texts.
Cross-Lingual Adaptation of Broadcast Transcription System to Polish Language Using Public Data Sources
We present methods and procedures designed for cost-efficient adaptation of an existing speech recognition system to Polish. The system (originally built for the Czech language) is adapted using common texts and speech recordings accessible from Polish web pages. The most critical part, an acoustic model (AM) for Polish, is built in several steps, which include: a) an initial bootstrapping phase th...
Unsupervised Keyword Extraction from Polish Legal Texts
In this work, we present an application of the recently proposed unsupervised keyword extraction algorithm RAKE to a corpus of Polish legal texts from the field of public procurement. RAKE is essentially a language- and domain-independent method. Its only language-specific input is a stoplist containing a set of non-content words. The performance of the method heavily depends on the choice of suc...
Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model
To facilitate the entry of data into the computer and its digitization, automatic recognition of printed texts and manuscripts is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...
Publication date: 2005